EDA on textual data

Import libraries

Define constants

Load the PWDB dataset and spaCy NLP component

Functions to calculate frequency for textual data
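A minimal sketch of such a frequency helper, using only the standard library. The function name `word_frequencies` and the regex tokenizer are assumptions made for illustration, not the notebook's own code; any iterable of strings works as input, including a pandas Series.

```python
import re
from collections import Counter
from typing import Iterable


def word_frequencies(texts: Iterable[str]) -> Counter:
    """Count lowercase word tokens across all strings in `texts`."""
    counts: Counter = Counter()
    for text in texts:
        # \w+ is a crude tokenizer (assumption); a spaCy tokenizer could replace it.
        counts.update(re.findall(r"\w+", text.lower()))
    return counts


sample = ["Income support for workers", "Support for small businesses"]
freq = word_frequencies(sample)
# Show the two most frequent words with their counts.
print(freq.most_common(2))
```

Returning a `Counter` keeps the result easy to feed into a chart, since `most_common(k)` already yields sorted (word, count) pairs.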

Function to get a spaCy NLP doc for each row of a series of strings

Function to get entity names from series of strings

Function to get list of words labeled with entity_name

Function to get TF-IDF for series of strings
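To make the TF-IDF step concrete, here is a hand-rolled sketch of the textbook formula, tf × log(N / df), in plain Python. It is not necessarily what the notebook uses (a library vectorizer with smoothing would give different numbers); the name `tf_idf` and the regex tokenizer are assumptions.

```python
import math
import re
from collections import Counter
from typing import Dict, Iterable, List


def tf_idf(texts: Iterable[str]) -> List[Dict[str, float]]:
    """Return one {term: score} mapping per input string."""
    docs = [re.findall(r"\w+", text.lower()) for text in texts]
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        total = len(doc) or 1  # guard against empty strings
        scores.append(
            {
                term: (count / total) * math.log(n_docs / df[term])
                for term, count in tf.items()
            }
        )
    return scores


scores = tf_idf(["wage subsidy scheme", "wage support"])
# Rank the terms of the first document by descending score.
top_terms = sorted(scores[0].items(), key=lambda kv: kv[1], reverse=True)
```

With this unsmoothed formula, a term that appears in every row scores zero, which is why terms specific to a few rows rise to the top of the ranking.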

Function to get n-grams from a series of strings
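One way to sketch the n-gram extraction with the standard library alone. The name `ngram_counts` and the regex tokenizer are illustrative assumptions; n-grams are built per row, so they never span two strings.

```python
import re
from collections import Counter
from typing import Iterable


def ngram_counts(texts: Iterable[str], n: int = 2) -> Counter:
    """Count n-grams (as tuples of tokens) within each string."""
    counts: Counter = Counter()
    for text in texts:
        tokens = re.findall(r"\w+", text.lower())
        # Zipping n staggered views of the token list yields consecutive n-tuples.
        counts.update(zip(*(tokens[i:] for i in range(n))))
    return counts


bigrams = ngram_counts(["short time work scheme", "short time allowance"], n=2)
print(bigrams.most_common(3))
```

Rows with fewer than `n` tokens simply contribute no n-grams, so no special handling is needed for short strings.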

Function to get list of noun phrases

Function to get list of words without stop words
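A small sketch of stop-word filtering. The `STOP_WORDS` set below is a tiny hypothetical list used only for illustration; in practice a fuller list, such as the one shipped with an NLP library, would take its place. The function name is likewise an assumption.

```python
import re
from typing import FrozenSet, Iterable, List

# Hypothetical, deliberately tiny stop-word list (assumption for this example).
STOP_WORDS: FrozenSet[str] = frozenset(
    {"a", "an", "and", "for", "in", "of", "the", "to"}
)


def remove_stop_words(
    texts: Iterable[str], stop_words: FrozenSet[str] = STOP_WORDS
) -> List[str]:
    """Return all lowercase tokens from `texts` that are not stop words."""
    words: List[str] = []
    for text in texts:
        tokens = re.findall(r"\w+", text.lower())
        words.extend(token for token in tokens if token not in stop_words)
    return words


kept = remove_stop_words(["Support for the self-employed"])
print(kept)
```

Note that the regex tokenizer splits hyphenated words, so "self-employed" becomes two tokens here.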

Function to clean textual data
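What counts as cleaning is not spelled out here, so the sketch below assumes a common recipe: drop HTML-like tags, replace every run of non-letter characters with a single space, and lowercase the result. The name `clean_texts` and the exact rules are assumptions.

```python
import re
from typing import Iterable, List


def clean_texts(texts: Iterable[str]) -> List[str]:
    """Normalize each string to lowercase letters separated by single spaces."""
    cleaned = []
    for text in texts:
        # Drop HTML-like tags such as <p> or </div>.
        text = re.sub(r"<[^>]+>", " ", text)
        # Replace runs of non-letters (punctuation, digits, underscores,
        # whitespace) with one space; Unicode letters are kept.
        text = re.sub(r"[\W\d_]+", " ", text)
        cleaned.append(text.strip().lower())
    return cleaned


cleaned = clean_texts(["<p>COVID-19:  Wage   support!</p>", "Café 2020"])
print(cleaned)
```

Removing digits is a judgment call: it also strips the "19" from "COVID-19", which may or may not be desirable for a given analysis.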

Function to plot a bar chart of observations

Function to plot a pie chart of observations

Function to display result of textual EDA

Textual EDA for words classified with the same entity name

Textual EDA for word frequency

Textual EDA for entity names

Textual EDA for noun phrases

Textual EDA for n-grams

Textual EDA for n-grams without stop words

Textual EDA for the top 10 TF-IDF terms from a column

Combined textual EDA

Execute combined textual EDA on specific columns from PWDB dataset